```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Elasticsearch Analyzers Explained</title>
    <link href="https://cdn.staticfile.org/font-awesome/6.4.0/css/all.min.css" rel="stylesheet">
    <link href="https://cdn.staticfile.org/tailwindcss/2.2.19/tailwind.min.css" rel="stylesheet">
    <link href="https://fonts.googleapis.com/css2?family=Noto+Serif+SC:wght@400;500;600;700&family=Noto+Sans+SC:wght@300;400;500;700&display=swap" rel="stylesheet">
    <script src="https://cdn.jsdelivr.net/npm/mermaid@latest/dist/mermaid.min.js"></script>
    <style>
        body {
            font-family: 'Noto Sans SC', Tahoma, Arial, Roboto, "Droid Sans", "Helvetica Neue", "Droid Sans Fallback", "Heiti SC", "Hiragino Sans GB", Simsun, sans-serif;
            background-color: #f8fafc;
            color: #1e293b;
        }
        .hero {
            background: linear-gradient(135deg, #4f46e5 0%, #7c3aed 100%);
        }
        .card {
            transition: all 0.3s ease;
            box-shadow: 0 4px 6px rgba(0, 0, 0, 0.05);
        }
        .card:hover {
            transform: translateY(-5px);
            box-shadow: 0 10px 15px rgba(0, 0, 0, 0.1);
        }
        .section-title {
            position: relative;
            padding-left: 1.5rem;
        }
        .section-title:before {
            content: '';
            position: absolute;
            left: 0;
            top: 50%;
            transform: translateY(-50%);
            width: 0.5rem;
            height: 1.5rem;
            background-color: #4f46e5;
            border-radius: 0.25rem;
        }
        .analyzer-icon {
            font-size: 2.5rem;
            color: #4f46e5;
        }
        .footer {
            background-color: #1e293b;
        }
        .footer a:hover {
            color: #818cf8;
        }
    </style>
</head>
<body class="min-h-screen flex flex-col">
    <!-- Hero Section -->
    <header class="hero text-white py-20 px-4 md:px-0">
        <div class="container mx-auto max-w-6xl text-center">
            <h1 class="text-4xl md:text-5xl font-bold mb-6 font-serif">Elasticsearch Analyzers Explained</h1>
            <p class="text-xl md:text-2xl mb-8 max-w-3xl mx-auto leading-relaxed">
                A closer look at text analysis in Elasticsearch: how each analyzer works and when to use it
            </p>
            <div class="flex justify-center space-x-4">
                <a href="#content" class="px-6 py-3 bg-white text-indigo-600 rounded-lg font-medium hover:bg-gray-100 transition-colors">
                    Start reading
                </a>
            </div>
        </div>
    </header>

    <!-- Main Content -->
    <main id="content" class="flex-grow container mx-auto max-w-6xl px-4 py-12">
        <!-- Overview Section -->
        <section class="mb-16">
            <div class="bg-white rounded-xl p-6 md:p-8 card">
                <div class="flex items-center mb-6">
                    <i class="fas fa-brain analyzer-icon mr-4"></i>
                    <h2 class="text-2xl font-bold">What is an analyzer?</h2>
                </div>
                <p class="text-gray-700 leading-relaxed mb-4">
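                    An analyzer is the Elasticsearch component that converts text into terms (tokens); it determines how documents are indexed and searched. An analyzer is made up of three parts, applied in order: zero or more character filters, exactly one tokenizer, and zero or more token filters.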
                </p>
                <div class="mermaid mt-6">
                    graph LR
                    A[Raw text] --> B[Character filters]
                    B --> C[Tokenizer]
                    C --> D[Token filters]
                    D --> E[Final terms]
                </div>
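                <!-- Example: the three analysis stages via the _analyze API (illustrative sketch) -->
                <p class="text-gray-700 leading-relaxed mt-6 mb-2">
                    The three stages above can be tried directly with the <code>_analyze</code> API. The request below is a minimal sketch in Kibana Dev Tools console syntax, assuming a running cluster; the sample text and the choice of <code>html_strip</code>, <code>standard</code>, <code>lowercase</code>, and <code>stop</code> are illustrative.
                </p>
                <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>POST _analyze
{
  "char_filter": ["html_strip"],
  "tokenizer": "standard",
  "filter": ["lowercase", "stop"],
  "text": "&lt;p&gt;The QUICK brown fox&lt;/p&gt;"
}</code></pre>
                <p class="text-gray-700 leading-relaxed mt-2">
                    The character filter strips the HTML tags, the tokenizer splits the remaining text into words, and the token filters lowercase each token and drop the stop word "the", which should leave <code>quick</code>, <code>brown</code>, and <code>fox</code>.
                </p>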
            </div>
        </section>

        <!-- Analyzers Section -->
        <section class="mb-16">
            <h2 class="text-3xl font-bold mb-8 section-title">Analyzer types</h2>
            
            <!-- Standard Analyzer -->
            <div class="bg-white rounded-xl p-6 md:p-8 mb-8 card">
                <div class="flex items-center mb-4">
                    <i class="fas fa-star analyzer-icon mr-4"></i>
                    <h3 class="text-xl font-bold">1. Standard Analyzer</h3>
                </div>
                <p class="text-gray-700 mb-4">
                    The standard analyzer is the Elasticsearch default. It splits text on word boundaries, lowercases every token, and drops most punctuation.
                </p>
                <div class="bg-gray-100 p-4 rounded-lg mb-4">
                    <img src="https://cdn.nlark.com/yuque/0/2021/png/21449790/1636547100493-400430ea-c47c-455c-80ae-59a9ccb74b5f.png" 
                         alt="Standard Analyzer example" 
                         class="w-full rounded">
                </div>
                <div class="bg-blue-50 p-4 rounded-lg">
                    <h4 class="font-medium text-blue-800 mb-2"><i class="fas fa-lightbulb mr-2"></i>Key features</h4>
                    <ul class="list-disc pl-5 text-blue-700">
                        <li class="mb-1">The default analyzer</li>
                        <li class="mb-1">Based on the Unicode Text Segmentation algorithm (UAX #29)</li>
                        <li class="mb-1">Removes most punctuation</li>
                        <li>Lowercases all terms</li>
                    </ul>
                </div>
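                <!-- Example: standard analyzer via the _analyze API (illustrative sketch) -->
                <p class="text-gray-700 mt-4 mb-2">
                    As a quick check, the analyzer can be run against a sample sentence with the <code>_analyze</code> API. This is a sketch in Kibana Dev Tools console syntax, assuming a running cluster; the sentence is the one used in the Elasticsearch reference documentation.
                </p>
                <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>POST _analyze
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}</code></pre>
                <p class="text-gray-700 mt-2">
                    Expected terms: <code>[ the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone ]</code>. The hyphen and the final period are dropped, while the apostrophe inside <code>dog's</code> is kept.
                </p>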
            </div>
            
            <!-- Simple Analyzer -->
            <div class="bg-white rounded-xl p-6 md:p-8 mb-8 card">
                <div class="flex items-center mb-4">
                    <i class="fas fa-sim-card analyzer-icon mr-4"></i>
                    <h3 class="text-xl font-bold">2. Simple Analyzer</h3>
                </div>
                <p class="text-gray-700 mb-4">
                    Splits text at every non-letter character and discards those characters, so only letters remain; all tokens are lowercased.
                </p>
                <div class="bg-gray-100 p-4 rounded-lg mb-4">
                    <img src="https://cdn.nlark.com/yuque/0/2021/png/21449790/1636547215527-34476e16-61b3-4395-9d6a-0f67b27260b0.png" 
                         alt="Simple Analyzer example" 
                         class="w-full rounded">
                </div>
                <div class="bg-blue-50 p-4 rounded-lg">
                    <h4 class="font-medium text-blue-800 mb-2"><i class="fas fa-lightbulb mr-2"></i>Key features</h4>
                    <ul class="list-disc pl-5 text-blue-700">
                        <li class="mb-1">Keeps letters only</li>
                        <li class="mb-1">Splits whenever a non-letter character is encountered</li>
                        <li>Lowercases all terms</li>
                    </ul>
                </div>
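                <!-- Example: simple analyzer via the _analyze API (illustrative sketch) -->
                <p class="text-gray-700 mt-4 mb-2">
                    A sketch of the same kind of check for the simple analyzer, in Kibana Dev Tools console syntax and assuming a running cluster; the sample sentence is illustrative.
                </p>
                <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>POST _analyze
{
  "analyzer": "simple",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}</code></pre>
                <p class="text-gray-700 mt-2">
                    Expected terms: <code>[ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]</code>. The digit <code>2</code> disappears, and <code>dog's</code> is split at the apostrophe.
                </p>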
            </div>
            
            <!-- Whitespace Analyzer -->
            <div class="bg-white rounded-xl p-6 md:p-8 mb-8 card">
                <div class="flex items-center mb-4">
                    <i class="fas fa-space-shuttle analyzer-icon mr-4"></i>
                    <h3 class="text-xl font-bold">3. Whitespace Analyzer</h3>
                </div>
                <p class="text-gray-700 mb-4">
                    Splits text on whitespace and does nothing else.
                </p>
                <div class="bg-gray-100 p-4 rounded-lg mb-4">
                    <img src="https://cdn.nlark.com/yuque/0/2021/png/21449790/1636547277475-94e98825-234c-4e24-99da-84fd4f7f46c2.png" 
                         alt="Whitespace Analyzer example" 
                         class="w-full rounded">
                </div>
                <div class="bg-blue-50 p-4 rounded-lg">
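                    <h4 class="font-medium text-blue-800 mb-2"><i class="fas fa-lightbulb mr-2"></i>Key features</h4>
                    <ul class="list-disc pl-5 text-blue-700">
                        <li class="mb-1">Splits only on whitespace characters</li>
                        <li class="mb-1">Preserves the original letter case</li>
                        <li>Keeps punctuation and every other non-whitespace character</li>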
                    </ul>
                </div>
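                <!-- Example: whitespace analyzer via the _analyze API (illustrative sketch) -->
                <p class="text-gray-700 mt-4 mb-2">
                    The request below sketches this behavior in Kibana Dev Tools console syntax, assuming a running cluster; the sample sentence is illustrative.
                </p>
                <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>POST _analyze
{
  "analyzer": "whitespace",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}</code></pre>
                <p class="text-gray-700 mt-2">
                    Expected terms: <code>[ The, 2, QUICK, Brown-Foxes, jumped, over, the, lazy, dog's, bone. ]</code>. Case, the hyphen, and the trailing period all survive.
                </p>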
            </div>
            
            <!-- Stop Analyzer -->
            <div class="bg-white rounded-xl p-6 md:p-8 mb-8 card">
                <div class="flex items-center mb-4">
                    <i class="fas fa-stop-circle analyzer-icon mr-4"></i>
                    <h3 class="text-xl font-bold">4. Stop Analyzer</h3>
                </div>
                <p class="text-gray-700 mb-4">
                    The Simple Analyzer plus a stop token filter: it also removes stop words such as "is", "a", and "the", which carry little meaning on their own.
                </p>
                <div class="bg-gray-100 p-4 rounded-lg mb-4">
                    <img src="https://cdn.nlark.com/yuque/0/2021/png/21449790/1636547457950-80f26ef1-d884-4e88-a9c6-c9ef9f4b429b.png" 
                         alt="Stop Analyzer example" 
                         class="w-full rounded">
                </div>
                <div class="bg-blue-50 p-4 rounded-lg">
                    <h4 class="font-medium text-blue-800 mb-2"><i class="fas fa-lightbulb mr-2"></i>Key features</h4>
                    <ul class="list-disc pl-5 text-blue-700">
                        <li class="mb-1">Includes stop word filtering</li>
                        <li class="mb-1">Removes common function words</li>
                        <li>Supports a custom stop word list</li>
                    </ul>
                </div>
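                <!-- Example: stop analyzer with a custom stop word list (illustrative sketch) -->
                <p class="text-gray-700 mt-4 mb-2">
                    A custom stop word list is configured in the index settings. The sketch below uses Kibana Dev Tools console syntax and assumes a running cluster; the index name <code>my-index-000001</code>, the analyzer name <code>my_stop_analyzer</code>, and the word list are hypothetical.
                </p>
                <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>PUT my-index-000001
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_stop_analyzer": {
          "type": "stop",
          "stopwords": ["the", "over"]
        }
      }
    }
  }
}

POST my-index-000001/_analyze
{
  "analyzer": "my_stop_analyzer",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}</code></pre>
                <p class="text-gray-700 mt-2">
                    With this list, "the" and "over" should be removed from the output. With the built-in <code>stop</code> analyzer and its default English list, the same sentence yields <code>[ quick, brown, foxes, jumped, over, lazy, dog, s, bone ]</code>.
                </p>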
            </div>
            
            <!-- Keyword Analyzer -->
            <div class="bg-white rounded-xl p-6 md:p-8 mb-8 card">
                <div class="flex items-center mb-4">
                    <i class="fas fa-key analyzer-icon mr-4"></i>
                    <h3 class="text-xl font-bold">5. Keyword Analyzer</h3>
                </div>
                <p class="text-gray-700 mb-4">
                    Performs no tokenization: the entire input is emitted as a single term.
                </p>
                <div class="bg-gray-100 p-4 rounded-lg mb-4">
                    <img src="https://cdn.nlark.com/yuque/0/2021/png/21449790/1636547495540-96f7af54-0e46-4251-95f4-85f7ff7e9db2.png" 
                         alt="Keyword Analyzer example" 
                         class="w-full rounded">
                </div>
                <div class="bg-blue-50 p-4 rounded-lg">
                    <h4 class="font-medium text-blue-800 mb-2"><i class="fas fa-lightbulb mr-2"></i>Key features</h4>
                    <ul class="list-disc pl-5 text-blue-700">
                        <li class="mb-1">Performs no tokenization at all</li>
                        <li class="mb-1">Emits the whole input as one term</li>
                        <li>Suited to exact-match scenarios</li>
                    </ul>
                </div>
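                <!-- Example: keyword analyzer via the _analyze API (illustrative sketch) -->
                <p class="text-gray-700 mt-4 mb-2">
                    A sketch of the keyword analyzer in Kibana Dev Tools console syntax, assuming a running cluster; the sample sentence is illustrative.
                </p>
                <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>POST _analyze
{
  "analyzer": "keyword",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}</code></pre>
                <p class="text-gray-700 mt-2">
                    The response contains exactly one term, identical to the input string, including its case and punctuation.
                </p>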
            </div>
            
            <!-- Chinese Analyzers -->
            <div class="mb-8">
                <h3 class="text-2xl font-bold mb-6 section-title">Chinese analyzers</h3>
                
                <!-- ICU Analyzer -->
                <div class="bg-white rounded-xl p-6 md:p-8 mb-8 card">
                    <div class="flex items-center mb-4">
                        <i class="fas fa-language analyzer-icon mr-4"></i>
                        <h4 class="text-xl font-bold">6. ICU Analyzer</h4>
                    </div>
                    <p class="text-gray-700 mb-4">
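                        An analyzer suited to Chinese text. It is built on the ICU libraries, whose Unicode support lets it handle Chinese better than the built-in analyzers. It is not bundled with Elasticsearch and must be installed as a plugin before use.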
                    </p>
                    <div class="bg-blue-50 p-4 rounded-lg">
                        <h4 class="font-medium text-blue-800 mb-2"><i class="fas fa-lightbulb mr-2"></i>Key features</h4>
                        <ul class="list-disc pl-5 text-blue-700">
                            <li class="mb-1">Built on the ICU library</li>
                            <li class="mb-1">Supports many languages</li>
                            <li class="mb-1">Follows the Unicode standard</li>
                            <li>Requires plugin installation</li>
                        </ul>
                    </div>
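                    <!-- Example: ICU analyzer via the _analyze API (illustrative sketch) -->
                    <p class="text-gray-700 mt-4 mb-2">
                        A minimal sketch of trying the ICU analyzer, assuming the <code>analysis-icu</code> plugin has been installed on every node (for example with <code>bin/elasticsearch-plugin install analysis-icu</code>) and the nodes have been restarted. The request uses Kibana Dev Tools console syntax, and the sample text is illustrative.
                    </p>
                    <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>POST _analyze
{
  "analyzer": "icu_analyzer",
  "text": "中华人民共和国国歌"
}</code></pre>
                    <p class="text-gray-700 mt-2">
                        Running the same text through the <code>standard</code> analyzer, which emits one term per Chinese character, makes the difference easy to compare.
                    </p>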
                </div>
                
                <!-- IK Analyzer -->
                <div class="bg-white rounded-xl p-6 md:p-8 mb-8 card">
                    <div class="flex items-center mb-4">
                        <i class="fas fa-hanukiah analyzer-icon mr-4"></i>
                        <h4 class="text-xl font-bold">7. IK Analyzer</h4>
                    </div>
                    <p class="text-gray-700 mb-4">
                        A widely used Chinese analyzer. It must be installed as a plugin before use.
                    </p>
                    <div class="bg-gray-100 p-4 rounded-lg mb-4">
                        <img src="https://cdn.nlark.com/yuque/0/2021/png/21449790/1636547692445-ae9c20d5-86ea-4441-9077-58dddad644f5.png" 
                             alt="IK Analyzer example" 
                             class="w-full rounded">
                    </div>
                    <div class="bg-blue-50 p-4 rounded-lg">
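                        <h4 class="font-medium text-blue-800 mb-2"><i class="fas fa-lightbulb mr-2"></i>Key features</h4>
                        <ul class="list-disc pl-5 text-blue-700">
                            <li class="mb-1">Designed specifically for Chinese</li>
                            <li class="mb-1">Offers smart (<code>ik_smart</code>) and fine-grained (<code>ik_max_word</code>) segmentation</li>
                            <li class="mb-1">Supports extension dictionaries</li>
                            <li>Widely used, with an active community</li>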
                        </ul>
                    </div>
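                    <!-- Example: comparing the two IK modes via the _analyze API (illustrative sketch) -->
                    <p class="text-gray-700 mt-4 mb-2">
                        The two segmentation modes can be compared side by side. This is a sketch in Kibana Dev Tools console syntax, assuming the IK plugin version matching the cluster is installed; the sample text is illustrative, and the exact terms returned depend on the dictionaries in use.
                    </p>
                    <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>POST _analyze
{
  "analyzer": "ik_smart",
  "text": "中华人民共和国国歌"
}

POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "中华人民共和国国歌"
}</code></pre>
                    <p class="text-gray-700 mt-2">
                        <code>ik_smart</code> aims for the coarsest split, while <code>ik_max_word</code> emits every word it can find, including overlapping ones, so it generally returns more terms for the same input.
                    </p>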
                </div>
            </div>
        </section>

        <!-- Summary Section -->
        <section class="mb-16">
            <div class="bg-white rounded-xl p-6 md:p-8 card">
                <h2 class="text-2xl font-bold mb-6 flex items-center">
                    <i class="fas fa-clipboard-list analyzer-icon mr-4"></i>
                    <span>Choosing an analyzer</span>
                </h2>
                <div class="mermaid">
                    flowchart TD
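                    A[Need to process Chinese text?] -->|Yes| B[IK or ICU Analyzer]
                    A -->|No| C[Need to keep the input exactly as is?]
                    C -->|Yes| D[Keyword Analyzer]
                    C -->|No| E[Need to remove stop words?]
                    E -->|Yes| F[Stop Analyzer]
                    E -->|No| G[Need simple letter-only tokenization?]
                    G -->|Yes| H[Simple Analyzer]
                    G -->|No| I[Need to split on whitespace only?]
                    I -->|Yes| J[Whitespace Analyzer]
                    I -->|No| K[Standard Analyzer]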
                </div>
                <div class="mt-6 bg-indigo-50 p-4 rounded-lg">
                    <h3 class="font-medium text-indigo-800 mb-2"><i class="fas fa-check-circle mr-2"></i>Best practices</h3>
                    <ul class="list-disc pl-5 text-indigo-700">
                        <li class="mb-1">Prefer the IK Analyzer for Chinese content</li>
                        <li class="mb-1">Use the Keyword Analyzer for exact matching</li>
                        <li class="mb-1">Use the Standard Analyzer for most English content</li>
                        <li>Define a custom analyzer when business requirements call for one</li>
                    </ul>
                </div>
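                <!-- Example: choosing analyzers per field in an index mapping (illustrative sketch) -->
                <p class="text-gray-700 leading-relaxed mt-6 mb-2">
                    These choices are usually applied per field in the index mapping. The sketch below uses Kibana Dev Tools console syntax and assumes the IK plugin is installed; the index name <code>articles</code> and the field names are hypothetical. For exact matching it uses a <code>keyword</code> sub-field, a common alternative to the Keyword Analyzer on a <code>text</code> field.
                </p>
                <pre class="bg-gray-800 text-gray-100 p-4 rounded-lg overflow-x-auto text-sm"><code>PUT articles
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart",
        "fields": {
          "raw": { "type": "keyword" }
        }
      },
      "summary_en": {
        "type": "text",
        "analyzer": "standard"
      }
    }
  }
}</code></pre>
                <p class="text-gray-700 leading-relaxed mt-2">
                    Indexing with <code>ik_max_word</code> and searching with <code>ik_smart</code> is a common pairing for Chinese text: the index holds fine-grained terms while queries stay coarse. Whether it fits depends on the recall and precision the application needs.
                </p>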
            </div>
        </section>
    </main>

    <!-- Footer -->
    <footer class="footer text-white py-8">
        <div class="container mx-auto max-w-6xl px-4 text-center">
            <div class="mb-4">
                <h3 class="text-lg font-medium">技术小馆</h3>
            </div>
            <div>
                <a href="http://www.yuque.com/jtostring" class="text-gray-300 hover:text-white transition-colors">
                    <i class="fas fa-globe mr-2"></i>http://www.yuque.com/jtostring
                </a>
            </div>
        </div>
    </footer>

    <script>
        mermaid.initialize({
            startOnLoad: true,
            theme: 'default',
            flowchart: {
                useMaxWidth: true,
                htmlLabels: true,
                curve: 'basis'
            }
        });
    </script>
</body>
</html>
```